.TH audiowaveform 5 "23 November 2018"

.SH NAME

audiowaveform \- waveform data file format

.SH DESCRIPTION

This page describes the binary data format produced by the
.BR audiowaveform (1)
program.

.SS Binary data format (.dat)

.B audiowaveform
expects binary waveform data files to use the ".dat" file extension. This format
consists of a header block followed by the actual waveform data. All
values are little-endian.

The header block starts with a version field that identifies how
the rest of the data in the file is structured.

.in +4
.nf
.na
.TS
lB lB lB
___
l l l.
Byte offset	Type	Field
0-3	int32_t	Version
.TE
.ad
.fi
.in -4

The version 1 header is structured as follows:

.in +4
.nf
.na
.TS
lB lB lB
___
l l l.
Byte offset	Type	Field
4-7	uint32_t	Flags
8-11	int32_t	Sample rate
12-15	int32_t	Samples per pixel
16-19	uint32_t	Length
.TE
.ad
.fi
.in -4

The version 2 header is structured as follows:

.in +4
.nf
.na
.TS
lB lB lB
___
l l l.
Byte offset	Type	Field
4-7	uint32_t	Flags
8-11	int32_t	Sample rate
12-15	int32_t	Samples per pixel
16-19	uint32_t	Length
20-23	int32_t	Channels
.TE
.ad
.fi
.in -4

Each of these fields is described in detail below.

.TP 4
.B Version
This field indicates the version number of the waveform data format. The version
1 and 2 data formats are described here. If the format changes in future, the
Version field will be incremented.

.TP
.B Flags
The Flags field describes the format of the waveform data that follows the
header.

.in +4
.nf
.na
.TS
lB lB
__
l l.
Bit 	Description
0 (lsb)	0: 16-bit resolution, 1: 8-bit resolution
1-31	Unused
.TE
.ad
.fi
.in -4

.TP
.B Sample rate
Sample rate of the original audio file (Hz).

.TP
.B Samples per pixel
Number of audio samples per waveform minimum/maximum pair.

.TP
.B Length
Length of waveform data (number of minimum and maximum value pairs per channel).
.PP

.TP
.B Channels
The number of waveform channels present (version 2 only).

Waveform data follows the header block and consists of pairs of minimum and
maximum values that each represent a range of samples of the original audio (the
"samples per pixel" header field).

The version 1 data format supports only a single audio channel; the
.B audiowaveform
program converts stereo audio to mono when generating
waveform data. The version 2 data format supports multiple channels, where the
data from each channel is interleaved.

For 8-bit data, the waveform data is represented as follows. Each value lies in
the range -128 to +127. The example shows a two channel waveform data file.

.in +4
.nf
.na
.TS
lB lB lB
___
l l l.
Byte offset	Type	Value
20	int8_t	Minimum sample value, index 0, channel 0
21	int8_t	Maximum sample value, index 0, channel 0
22	int8_t	Minimum sample value, index 0, channel 1
23	int8_t	Maximum sample value, index 0, channel 1
24	int8_t	Minimum sample value, index 1, channel 0
25	int8_t	Maximum sample value, index 1, channel 0
26	int8_t	Minimum sample value, index 1, channel 1
27	int8_t	Maximum sample value, index 1, channel 1
etc	...	...
.TE
.ad
.fi
.in -4

Pairs of minimum and maximum values repeat to end of file.

For 16-bit data, the waveform data is represented as follows. Each value lies in
the range -32768 to +32767. The example shows a two channel waveform data file.

.in +4
.nf
.na
.TS
lB lB lB
___
l l l.
Byte offset	Type	Value
20-21	int16_t	Minimum sample value, index 0, channel 0
22-23	int16_t	Maximum sample value, index 0, channel 0
24-25	int16_t	Minimum sample value, index 0, channel 1
25-26	int16_t	Maximum sample value, index 0, channel 1
27-28	int16_t	Minimum sample value, index 1, channel 0
29-30	int16_t	Maximum sample value, index 1, channel 0
31-32	int16_t	Minimum sample value, index 1, channel 1
33-34	int16_t	Maximum sample value, index 1, channel 1
etc	...	...
.TE
.ad
.fi
.in -4

Pairs of minimum and maximum values repeat to end of file.

.SS JSON data format (.json)

The JSON data format contains the same information as the binary format.
.B audiowaveform
expects JSON data files to use the ".json" file extension.
The format consists of a single JSON object containing fields as described below.

.TP
.B \fBversion\fR (Number)
The version number of the waveform data format. The version 1 and 2 data formats
are described here. If the format changes in future, the version field will be
incremented.

.TP
.B \fBchannels\fR (Number)
The number of waveform channels present (version 2 only).

.TP
.B \fBsample_rate\fR (Number)
Sample rate of original audio file (Hz).

.TP
.B \fBsamples_per_pixel\fR (Number)
Number of audio samples per waveform minimum/maximum pair.

.TP
.B \fBbits\fR (Number)
Resolution of waveform data. May be either 8 or 16.

.TP
.B \fBlength\fR (Number)
Length of waveform data (number of minimum and maximum value pairs).

.TP
.B \fBdata\fR (Array of Numbers)
Minimum and maximum waveform data points, interleaved.
The example shows a two channel waveform data file.
Depending on \fBbits\fR, each value may be in the range -128 to +127
or -32768 to +32767.

.PP

.in +4
.nf
.na
.TS
lB lB
___
l l.
Array offset	Value
20-21	Minimum sample value, index 0, channel 0
22-23	Maximum sample value, index 0, channel 0
24-25	Minimum sample value, index 0, channel 1
25-26	Maximum sample value, index 0, channel 1
27-28	Minimum sample value, index 1, channel 0
29-30	Maximum sample value, index 1, channel 0
31-32	Minimum sample value, index 1, channel 1
33-34	Maximum sample value, index 1, channel 1
etc	...
.TE
.ad
.fi
.in -4

.SS Example

The following is an example of a (very short) waveform data file in JSON format.

.in +4
.nf
.na
{
    "version": 2,
    "channels": 2,
    "sample_rate": 48000,
    "samples_per_pixel:" 512,
    "bits": 8,
    "length": 3,
    "data": [-65,63,-66,64,-40,41,-39,45,-55,43,-55,44]
}
.ad
.fi
.in -4

.SH SEE ALSO

.BR audiowaveform (1)

.SH AUTHOR

.UR chris@chrisneedham.com
Chris Needham
.UE
